Multi-view Recurrent Neural Acoustic Word Embeddings
Abstract
Recent work has begun exploring neural acoustic word embeddings—fixed-dimensional vector representations of arbitrary-length speech segments corresponding to words. Such embeddings are applicable to speech retrieval and recognition tasks, where reasoning about whole words may make it possible to avoid ambiguous sub-word representations. The main idea is to map acoustic sequences to fixed-dimensional vectors such that examples of the same word are mapped to similar vectors, while different-word examples are mapped to very different vectors. In this work we take a multi-view approach to learning acoustic word embeddings, in which we jointly learn to embed acoustic sequences and their corresponding character sequences. We use deep bidirectional LSTM embedding models and multi-view contrastive losses. We study the effect of different loss variants, including fixed-margin and cost-sensitive losses. Our acoustic word embeddings improve over previous approaches for the task of word discrimination. We also present results on other tasks that are enabled by the multi-view approach, including cross-view word discrimination and word similarity.
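To make the setup in the abstract concrete, below is a minimal PyTorch sketch of two bidirectional LSTM view encoders (one for acoustic frames, one for character sequences) trained with a fixed-margin cross-view contrastive loss. All module names, dimensions, and the margin value are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of the multi-view approach; names and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViewEncoder(nn.Module):
    """Bidirectional LSTM that embeds a variable-length sequence as one vector."""
    def __init__(self, input_dim, hidden_dim, embed_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden_dim, embed_dim)

    def forward(self, x):
        # x: (batch, time, input_dim); use the final-layer hidden states of
        # both directions as the fixed-dimensional embedding.
        _, (h, _) = self.lstm(x)
        h = torch.cat([h[-2], h[-1]], dim=1)      # forward + backward final states
        return F.normalize(self.proj(h), dim=1)   # unit-length embeddings

def multiview_margin_loss(acoustic, characters, negatives, margin=0.5):
    """Fixed-margin contrastive loss across views: an acoustic embedding should
    be closer (in cosine distance) to its own word's character embedding than
    to a different word's character embedding, by at least `margin`."""
    pos = 1.0 - F.cosine_similarity(acoustic, characters)  # matched pairs
    neg = 1.0 - F.cosine_similarity(acoustic, negatives)   # mismatched pairs
    return F.relu(margin + pos - neg).mean()

# Hypothetical usage with random tensors standing in for real features.
acoustic_net = ViewEncoder(input_dim=39, hidden_dim=256, embed_dim=128)  # e.g. MFCC frames
char_net = ViewEncoder(input_dim=30, hidden_dim=256, embed_dim=128)      # e.g. one-hot letters
a = acoustic_net(torch.randn(8, 100, 39))  # 8 segments, 100 frames each
c = char_net(torch.randn(8, 7, 30))        # matching character sequences
n = char_net(torch.randn(8, 7, 30))        # negative (different-word) sequences
print(multiview_margin_loss(a, c, n))
```

Because both views land in the same space, the same distance function supports acoustic-to-acoustic word discrimination and the cross-view (acoustic-to-text) tasks the abstract mentions.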
Similar Resources
Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings
Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors — acoustic word embeddings — and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an ...
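As a concrete illustration of the comparison this snippet describes, here is a small sketch of ranking candidate segments against a query by cosine distance between their embeddings. The random vectors stand in for the output of a trained acoustic embedding model; this is illustrative only.

```python
# Illustrative sketch: rank candidates by cosine distance between embeddings.
import numpy as np

def cosine_distance(u, v):
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def rank_candidates(query_emb, candidate_embs):
    """Return candidate indices sorted from best to worst match."""
    dists = [cosine_distance(query_emb, c) for c in candidate_embs]
    return np.argsort(dists)

# Hypothetical 128-dim embeddings for a query and three candidate segments.
rng = np.random.default_rng(0)
query = rng.standard_normal(128)
candidates = [rng.standard_normal(128) for _ in range(3)]
print(rank_candidates(query, candidates))
```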
Large Scale Multi-label Text Classification with Semantic Word Vectors
Multi-label text classification has been applied to a multitude of tasks, including document indexing, tag suggestion, and sentiment classification. However, many of these methods disregard word order, opting to use bag-of-words models or TFIDF weighting to create document vectors. With the advent of powerful semantic embeddings, such as word2vec and GloVe, we explore how word embeddings and wo...
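A toy sketch of one simple alternative to bag-of-words/TF-IDF mentioned in this snippet: averaging word embeddings into a document vector. The tiny lookup table is a stand-in for real word2vec or GloVe vectors.

```python
# Toy sketch: a document vector as the mean of its word embeddings.
import numpy as np

# Hypothetical pretrained vectors (in practice, loaded from word2vec/GloVe).
vectors = {
    "good": np.array([0.9, 0.1, 0.0]),
    "movie": np.array([0.2, 0.8, 0.1]),
    "plot": np.array([0.1, 0.7, 0.3]),
}

def doc_vector(tokens):
    """Average the embeddings of known tokens into one document vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    return np.mean(known, axis=0) if known else np.zeros(3)

print(doc_vector(["good", "movie", "plot", "unknownword"]))
```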
CUFE at SemEval-2016 Task 4: A Gated Recurrent Model for Sentiment Classification
In this paper we describe a deep learning system that has been built for SemEval 2016 Task4 (Subtask A and B). In this work we trained a Gated Recurrent Unit (GRU) neural network model on top of two sets of word embeddings: (a) general word embeddings generated from unsupervised neural language model; and (b) task specific word embeddings generated from supervised neural language model that was...
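A rough PyTorch sketch in the spirit of the system this snippet describes: a GRU classifier over two sets of word embeddings whose outputs are concatenated. The dimensions and the concatenation scheme are assumptions for illustration, not the authors' exact model.

```python
# Rough sketch: GRU sentiment classifier over two word-embedding tables.
import torch
import torch.nn as nn

class GRUSentiment(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=3):
        super().__init__()
        # Stand-ins for the general and task-specific embeddings mentioned above.
        self.general = nn.Embedding(vocab_size, embed_dim)
        self.task = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(2 * embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):
        x = torch.cat([self.general(tokens), self.task(tokens)], dim=-1)
        _, h = self.gru(x)
        return self.out(h[-1])  # logits from the final hidden state

model = GRUSentiment(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 20)))  # 4 sentences, 20 tokens each
print(logits.shape)  # torch.Size([4, 3])
```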
Acoustic Word Embeddings for ASR Error Detection
This paper focuses on error detection in Automatic Speech Recognition (ASR) outputs. A neural network architecture is proposed, which is well suited to handle continuous word representations, like word embeddings. In a previous study, the authors explored the use of linguistic word embeddings, and more particularly their combination. In this new study, the use of acoustic word embeddings is exp...
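A hedged sketch of the kind of architecture this snippet describes: a small classifier that labels each recognized word as correct or erroneous from a continuous word representation such as an acoustic word embedding. All dimensions are assumptions; this is not the paper's actual model.

```python
# Hedged sketch: per-word error detection from continuous word representations.
import torch
import torch.nn as nn

class ErrorDetector(nn.Module):
    def __init__(self, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),  # two classes: correct vs. error
        )

    def forward(self, word_embeddings):
        # word_embeddings: (num_words, embed_dim), e.g. the acoustic word
        # embedding of each word in the ASR output.
        return self.net(word_embeddings)

detector = ErrorDetector()
logits = detector(torch.randn(10, 128))  # 10 hypothesized words
print(logits.argmax(dim=1))              # predicted label per word
```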
On the Use of Word Embeddings Alone to Represent Natural Language Sequences
To construct representations for natural language sequences, information from two main sources needs to be captured: (i) semantic meaning of individual words, and (ii) their compositionality. These two types of information are usually represented in the form of word embeddings and compositional functions, respectively. For the latter, Recurrent Neural Networks (RNNs) and Convolutional Neural Ne...
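A minimal sketch of the two ingredients this snippet names, word embeddings and a compositional function: order-insensitive mean pooling (embeddings alone) contrasted with an order-sensitive RNN composition. Names and sizes are illustrative assumptions.

```python
# Sketch: mean pooling vs. RNN composition over word embeddings.
import torch
import torch.nn as nn

embed = nn.Embedding(10000, 100)          # word embeddings
rnn = nn.RNN(100, 100, batch_first=True)  # one possible compositional function

tokens = torch.randint(0, 10000, (1, 12))  # a 12-word sequence
vectors = embed(tokens)

mean_repr = vectors.mean(dim=1)  # order-insensitive: embeddings alone
_, h = rnn(vectors)
rnn_repr = h[-1]                 # order-sensitive: RNN composition
print(mean_repr.shape, rnn_repr.shape)
```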
Journal: CoRR
Volume: abs/1611.04496
Published: 2016